- Notifications
You must be signed in to change notification settings - Fork 849
/
Copy pathOML4SQL Procedure for Importing Data to ADB.dsnb
executable file
·1 lines (1 loc) · 16.6 KB
/
OML4SQL Procedure for Importing Data to ADB.dsnb
1
[{"layout":null,"template":null,"templateConfig":null,"name":"OML4SQL Procedure for Importing Data to ADB","description":null,"readOnly":false,"type":"low","paragraphs":[{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":null,"title":null,"message":["%md"," "],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":true,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":null,"message":["%md","","# Importing Data from Oracle Cloud Object Storage","","So, you've signed up for an Oracle Cloud account and you want to start exploring Oracle Machine Learning in Autonomous Database. The first step is to import your data, and there are a few different options to choose from. For example, Oracle Database Actions provides a friendly web-based interface to load data from local files, cloud storage, or from remote databases. Another option for loading data from local files is to use SQL*Loader. For moving data between Oracle Database and Autonomous Database, Oracle Data Pump offers very fast data movement. ","","But how do you import data into Oracle Autonomous Database without leaving the Oracle Machine Learning Notebooks environment? The answer is to use the `DBMS_CLOUD` package.","","`DBMS_CLOUD` provides a simple way to interact with an object store from the database. It supports loading data files from Oracle Cloud Infrastructure Object Storage, Github repositories, Amazon AWS S3, Microsoft Azure Cloud Storage, and Google Cloud Storage. ","","In this example, we demonstrate using a custom procedure `IMPORT_DATA_OBJSTORE_DATASET`, which leverages `DBMS_CLOUD`, to import data from an Oracle Cloud Object Storage bucket to an Autonomous Database table. When using `DBMS_CLOUD.COPY_DATA` alone, the target table must be created in advance. With `IMPORT_DATA_OBJSTORE_DATASET`, you simply provide the dataset name, and the table will be created and the data inserted in a single step. This custom procedure was created to facilitate easy import of the datasets used in Oracle Machine Learning notebooks and demos.","","For a one-off approach to importing specific datasets, refer to the `OML Run-me-first` notebook in the OML Notebooks template examples.","","### Prerequisites:","- An Oracle Machine Learning account with grant EXECUTE privileges on DBMS_CLOUD","","To run `DBMS_CLOUD` subprograms with a user other than ADMIN you need to grant EXECUTE privileges to that user. For example, run the following command as ADMIN to grant privileges to OMLUSER:",""," GRANT EXECUTE ON DBMS_CLOUD TO OMLUSER;","","- A public Oracle Cloud Object Storage bucket","","Copyright (c) 2024 Oracle Corporation ","###### <a href=\"https://oss.oracle.com/licenses/upl/\" target=\"_blank\">The Universal Permissive License (UPL), Version 1.0<\/a>"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"For more information...","message":["%md","","* <a href=\"https://docs.oracle.com/en/database/oracle/machine-learning/oml-notebooks/omlug/get-started-oracle-machine-learning.html#GUID-2AEC56A4-E751-48A3-AAA0-0659EDD639BA\" target=\"_blank\">Getting Started with Oracle Machine Learning Notebooks<\/a>","* <a href=\"https://docs.oracle.com/en/cloud/paas/autonomous-database/adbsa/load-data.html#GUID-1351807C-E3F7-4C6D-AF83-2AEEADE2F83E\" target=\"_blank\">Loading Data with Autonomous Database<\/a>","* <a href=\"https://docs.oracle.com/en-us/iaas/Content/Object/Concepts/objectstorageoverview.htm\" target=\"_blank\">Object Storage Bucket Overview<\/a>","* <a href=\"https://docs.oracle.com/en/cloud/paas/autonomous-database/adbdu/dbms_cloud-objects-and-files.html#GUID-12C083D6-76ED-4D35-8BD6-B277D91090F9\" target=\"_blank\">DBMS_CLOUD Overview<\/a>"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"Custom procedure for loading data ","message":["%md","","The custom procedure `IMPORT_DATA_OBJSTORE_DATASET` imports the data from Object Storage directly to a database table using a combination of PL/SQL commands and the `DBMS_CLOUD.COPY_DATA` procedure. Simply provide the dataset name, and the table will be created and the data inserted to the target table.","","#### Prerequisites:","","- The CSV file containing the data and a text file with the table definition must be saved in the Object Storage bucket","-- Note, the default Object Storage bucket already contains the CSV and text files needed to create the OML demo datasets","- Both the CSV and text files must match the name of the input dataset","-- For example, *CUSTOMER_INSURANCE_LTV2.txt* and *CUSTOMER_INSURANCE_LTV2.csv*","- The Object Storage bucket must be public","-- Either the default bucket or the user's public bucket may be used","- The OML ADMIN user must grant execute on the DBMS_CLOUD procedures to the OML user ","-- Use the command *grant execute on DBMS_CLOUD to OMLUSER*, where OMLUSER is the OML username","","#### Input variables:","- DATASET: The name of CSV data set residing in Object Storage","- VERBOSE: Optional. Controls the amount of information returned from the procedure","-- Default is FALSE, which returns the table name and number of rows in the table","-- When TRUE, returns detailed object information, including the default return values, the file names in Object Storage, and the table definition","- LOCATION_URI: Optional. The Object Storage URL where the files reside. The default value is the Oracle Machine Learning public bucket.","","#### Basic usage:","To run the procedure, pass the dataset name as the `DATASET` parameter:",""," EXEC IMPORT_DATA_OBJSTORE_DATASET('CUSTOMER_INSURANCE_LTV2')"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":"Define the custom procedure","message":["%script\r","\r","CREATE OR REPLACE PROCEDURE IMPORT_DATA_OBJSTORE_DATASET(\r"," DATASET IN VARCHAR2,\r"," VERBOSE IN BOOLEAN DEFAULT FALSE,\r"," LOCATION_URI IN VARCHAR2 DEFAULT 'https://objectstorage.us-ashburn-1.oraclecloud.com/n/adwc4pm/b/OML_Data/o/') AUTHID CURRENT_USER AS\r","dataset_upper VARCHAR2(100);\r","file_name VARCHAR2(1000);\r","table_name varchar2(1000);\r","obj varchar2(1000);\r","sql_file varchar2(1000);\r","b_data BLOB;\r","v_data VARCHAR2(5000);\r","v_rownum INTEGER;\r","\r","------------------------------------------------------\r","-- Declare a new variable to convert the input\r","-- dataset to upper case. \r","\r","------------------------------------------------------\r","\r","BEGIN\r"," dataset_upper := upper(dataset);\r","\r","------------------------------------------------------\r","-- Define file name from input dataset\r","------------------------------------------------------\r","\r","SELECT CONCAT(dataset_upper, '.csv') INTO FILE_NAME from dual;\r","\r","IF VERBOSE THEN\r"," DBMS_OUTPUT.PUT_LINE('Dataset name is: '||dataset_upper);\r"," DBMS_OUTPUT.PUT_LINE('CSV file name is: '||file_name);\r","END IF;\r","\r","------------------------------------------------------\r","-- Select file from Object Storage\r","------------------------------------------------------\r","\r","SELECT object_name INTO obj FROM DBMS_CLOUD.LIST_OBJECTS(NULL, LOCATION_URI) where object_name = FILE_NAME;\r","\r","------------------------------------------------------\r","-- Define table name\r","------------------------------------------------------\r","\r","SELECT REGEXP_SUBSTR(obj, '[^.]+') INTO table_name from dual;\r","\r"," DBMS_OUTPUT.PUT_LINE('Table name is: '||table_name);\r","\r","------------------------------------------------------\r","-- Define file containing the CREATE TABLE statement,\r","-- which is saved in a TXT file. DBMS_CLOUD does not\r","-- support reading SQL files.\r","------------------------------------------------------\r","\r","SELECT table_name||'.txt' INTO sql_file from dual;\r","\r","IF VERBOSE THEN\r"," DBMS_OUTPUT.PUT_LINE('Text file is: '||sql_file);\r","END IF;\r","\r","------------------------------------------------------\r","-- Read the SQL statement from the text file into\r","-- a binary variable and execute the statement\r","------------------------------------------------------\r","\r","BEGIN\r"," b_data := DBMS_CLOUD.GET_OBJECT(\r"," credential_name => NULL,\r"," object_uri => LOCATION_URI||sql_file);\r"," v_data := utl_raw.cast_to_varchar2(b_data);\r"," IF VERBOSE THEN\r"," DBMS_OUTPUT.PUT_LINE('Table definition is: '||v_data);\r"," END IF;\r"," EXECUTE IMMEDIATE v_data;\r","END;\r","\r","------------------------------------------------------\r","-- Write data to the final table specified as dataset\r","-- input. If the table already exists, an error is\r","-- returned.\r","------------------------------------------------------\r","\r","BEGIN\r"," DBMS_CLOUD.COPY_DATA (\r"," credential_name => NULL,\r"," table_name => table_name,\r"," file_uri_list => LOCATION_URI||file_name,\r"," format => json_object('type' value 'CSV', 'skipheaders' value 1));\r","END;\r","\r","------------------------------------------------------\r","-- Return the number of rows in the table\r","------------------------------------------------------\r","\r","EXECUTE IMMEDIATE 'SELECT count(*) FROM ' || table_name INTO v_rownum;\r","DBMS_OUTPUT.PUT_LINE ('Table created with number of rows = ' || v_rownum);\r","\r","END;\r"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":"Delete the table if it exists","message":["%script","","BEGIN EXECUTE IMMEDIATE 'DROP TABLE CUSTOMER_INSURANCE_LTV2';","EXCEPTION WHEN OTHERS THEN NULL; END;"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"Run the procedure in default mode","message":["%md","","In default mode, the only required input is the dataset name, `CUSTOMER_INSURANCE_LTV2`. The table `CUSTOMER_INSURANCE_LTV2` is created and the table name and number of rows in the table is returned."],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":null,"message":["%script","","EXEC IMPORT_DATA_OBJSTORE_DATASET('CUSTOMER_INSURANCE_LTV2')"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"table","title":"Display the first 10 rows of the table","message":["%sql","","SELECT * ","FROM CUSTOMER_INSURANCE_LTV2 ","FETCH FIRST 10 ROWS ONLY;"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"Run the custom procedure in verbose mode","message":["%md","","In verbose mode, additional information is returned, including the dataset, CSV, and text file names, and the table definition. Verbose mode is enabled by specified by setting the `VERBOSE`parameter to `TRUE`."],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":"Delete the previously created table","message":["%script","","BEGIN EXECUTE IMMEDIATE 'DROP TABLE CUSTOMER_INSURANCE_LTV2';","EXCEPTION WHEN OTHERS THEN NULL; END;"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":null,"message":["%script","","EXEC IMPORT_DATA_OBJSTORE_DATASET('CUSTOMER_INSURANCE_LTV2', TRUE)"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"table","title":"Display the first 10 rows of the table","message":["%sql","","SELECT * ","FROM CUSTOMER_INSURANCE_LTV2 ","FETCH FIRST 10 ROWS ONLY;"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"Run the procedure using your own Object Storage bucket","message":["%md","","To use your own Object Storage bucket with `IMPORT_DATA_OBJSTORE_DATASET`, pass the bucket URI to the `LOCATION_URI` parameter. Recall that the CSV file containing the data and the text file with the table definition should first be uploaded to the bucket. The example below demonstrates this functionality in default mode, and replacing NULL for TRUE will enable verbose mode."],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":null,"title":null,"message":["%script","","EXEC IMPORT_DATA_OBJSTORE_DATASET('CUSTOMER_INSURANCE_LTV2', NULL, '<your bucket URI here>')"],"enabled":false,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":null},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"Try it with your data","message":["%md","","To use the `IMPORT_DATA_OBJSTORE_DATASET` procedure with your own dataset, create the text file containing the table definition (in a single line), save the corresponding CSV data file to your public Object Storage bucket, and run the procedure with the dataset name as input. ","","For example, for the IRIS data, the bucket will have IRIS.csv containing the data, and IRIS.txt with table definition:",""," CREATE TABLE IRIS(SEPAL_LENGTH number, SEPAL_WIDTH number, PETAL_LENGTH number, PETAL_WIDTH number, SPECIES varchar2(4000))","","Run the procedure using your bucket:",""," EXEC IMPORT_DATA_OBJSTORE_DATASET('IRIS', NULL, '<your bucket URI>')"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":null,"message":["%md","","## End of script"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":null,"title":null,"message":["%md"," "],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":0,"hideResult":true,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"}],"version":"6","snapshot":false,"tags":null}]